9 research outputs found

    Hierarchical Memory Size Estimation for Loop Transformation and Data Memory Platform Optimization

    In today’s embedded systems, the memory hierarchy is rapidly becoming a major bottleneck in terms of power, performance and area, due to the very large amount of (memory-related) data that needs to be transferred and (temporarily) stored. This is especially the case for portable multimedia application systems. These applications are characterized by deep loop nests and multi-dimensional arrays at the high level. Due to the dramatically increasing size and complexity of system-on-a-chip (SoC) designs and stringent time-to-market requirements, the methodology and tools for chip design must be raised to the system level. Early analysis tools are particularly critical in enabling SoC designers to take full advantage of the many architectural options available. For memory optimization, the early high-level techniques aim either to design an optimal memory platform for a given application, or to optimize the application code in order to take advantage of the memory platform features, or both. Loop transformation is one such important high-level optimization technique. It modifies the execution order of loops and statements without changing the application functionality. Existing loop transformation algorithms all rely either on reducing data access lifetimes or on improving data locality and regularity to steer the selection of loop transformations. These are, however, very abstract cost functions which do not represent the exact memory size requirement of the arrays or how the data will be mapped onto the memory platform later on. Existing algorithms also all result in one final loop transformation solution. As different loop transformations may result in optimal utilization for different memory platform instances, ad hoc decisions at this stage, made without estimating their impact on the actual hierarchy utilization, can lead to a sub-optimal final solution. An evaluation of the effort of later design stages is hence required. 
On the other hand, there usually exists a huge number of loop transformation possibilities, so the estimation has to be performed repeatedly, and the computation time of the estimation technique becomes critical for it to be useful during the loop transformation search space exploration. This dissertation proposes a memory footprint estimation methodology. An intra-array memory footprint estimation is performed first, followed by an inter-array estimation. Several techniques have been introduced to make the estimate fast enough to be used repeatedly during the early high-level search space exploration. A fast intra-array memory footprint estimation is performed on the iteration domain based on the maximal lifetime of data accesses, which is defined by the maximal dependency vector. Two approaches, an integer linear programming (ILP) formulation and a vertex-based approach, have been introduced for a fast maximal dependency vector calculation. The fast inter-array estimation is based on several Hanoi-tower-based approaches. A hierarchical memory size estimation methodology has also been proposed in this dissertation. It estimates the influence of any given sequence of loop transformation instances on the mapping of application data onto a hierarchical memory platform. As the exact memory platform instantiation is often not yet defined at this high-level design stage, a platform-independent estimation is introduced with a Pareto curve output for each loop transformation instance. It can steer the designer or an automatic steering tool to select all the interesting loop transformation instances that might later lead to a low-power data mapping for any of the many possible memory hierarchy instances. This is useful both when the memory platform is not defined yet and for a given memory hierarchy instance. It also makes it possible to find the most appropriate low-power memory hierarchy instance by performing an early power estimation of different memory hierarchy instances. 
Initially the source code is used as input for estimation, resulting in an initial approach. However, performing the estimation repeatedly from the source code is too slow for the large loop transformation search space exploration. An incremental approach, based on local updating of the previous result, is thus introduced to handle sequences of different loop transformations. Several advanced techniques have also been used in these two approaches to make the estimation fast, such as data reuse analysis based on a bounding-box geometrical model, platform-independent estimation of memory hierarchy layer assignment, and fast intra- and inter-array memory footprint estimation. The feasibility and usefulness of the methodologies are substantiated using several representative real-life application demonstrators. These show, for instance, that the fast memory footprint estimation can be two orders of magnitude faster than the compared techniques while still achieving fairly accurate estimation results. For the hierarchical memory size estimation methodology, the initial approach is two orders of magnitude faster than the compared technique, and the incremental approach is another two orders of magnitude faster than the initial approach, taking just a few milliseconds. The fast computation time of the incremental approach makes it feasible to use it repeatedly during the loop transformation exploration over a very large number of possibilities. Furthermore, prototype CAD tools have been developed that include most parts of the methodologies.
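
The idea of bounding an intra-array footprint by the maximal dependency distance can be illustrated with a toy sketch. This is not the dissertation's ILP or vertex-based method. It assumes a rectangular iteration domain, constant dependency distance vectors, and one array element written per iteration, and the function names are hypothetical. A brute-force count of live elements is included only to check the estimate on a small example.

```python
from itertools import product

def linearized_distance(dep, extents):
    """Iterations elapsed between producer and consumer for one distance vector."""
    dist = 0
    for d, n in zip(dep, extents):
        dist = dist * n + d  # Horner form of the row-major linearization
    return dist

def estimate_window(deps, extents):
    """Estimated number of elements that must be kept: the longest lifetime."""
    return max(linearized_distance(d, extents) for d in deps)

def exact_peak_live(deps, extents):
    """Brute force: peak count of elements produced earlier and still needed."""
    points = list(product(*(range(n) for n in extents)))  # execution order
    time_of = {p: t for t, p in enumerate(points)}
    last_use = {}
    for p in points:
        uses = []
        for d in deps:
            q = tuple(a + b for a, b in zip(p, d))  # consumer iteration
            if q in time_of:
                uses.append(time_of[q])
        if uses:
            last_use[p] = max(uses)
    peak = 0
    for t in range(len(points)):
        live = sum(1 for p, last in last_use.items() if time_of[p] < t <= last)
        peak = max(peak, live)
    return peak

# A[i][j] depends on A[i-1][j] and A[i][j-1]; loop nest is 4 x 5 (i outer).
deps = [(1, 0), (0, 1)]
original = estimate_window(deps, (4, 5))
# After loop interchange (j outer), the extents and vector components swap.
interchanged = estimate_window([(0, 1), (1, 0)], (5, 4))
```

In this toy case the estimate is 5 elements for the original loop order and 4 after interchange, and both agree with the brute-force count. The point is only that a loop transformation changes the maximal lifetime, and therefore the estimated footprint.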

    AMS measurement of 53Mn and its initial application at CIAE

    The determination of cosmogenic 53Mn in terrestrial archives has important applications, such as burial ages, exposure ages and erosion rates. Accelerator mass spectrometry (AMS) is the most sensitive technique to detect minute amounts of 53Mn. 53Mn measurements were developed at the China Institute of Atomic Energy (CIAE) using the ΔE-Q3D equipped AMS system. This approach was recently optimized with the goal of reaching the sensitivity required for AMS measurements of 53Mn in deep-sea ferromanganese crust (DSFC) samples. Based on these improvements in sample preparation, beam current, transmission and so on, 53Mn in two DSFC samples was measured by AMS. The 53Mn/Mn ratios, corresponding to ages of 3.77 ± 0.42 and 13.73 ± 2.74 Ma by the 129I dating method, are (5.01 ± 2.15) × 10⁻¹³ and (1.90 ± 0.96) × 10⁻¹³. The ratios are close to the experimental reference values deduced from previous research. The experimental progress, performance and results are presented in this contribution. This work was mainly supported by the National Natural Science Foundation of China (NSFC) under Grant No. 11075221, and partly supported by the National Natural Science Foundation of China under Grant Nos. 10705054, 41073044 and 11265005.

    Memory requirement optimization with loop fusion and loop shifting

    Loop fusion and loop shifting are well-recognized loop transformations for memory requirement reduction. State-of-the-art optimizations with loop fusion and shifting are based on heuristics without any evaluation of the resulting effects during each optimization step. Thus we cannot guarantee that each step results in a reduced overall memory requirement. On the other hand, most memory requirement estimations at system level are inefficient and slow. Moreover, the estimation is not started until the optimization is done, and having to iterate between optimization and estimation is very time consuming. In this paper, we present a storage requirement optimization method which combines the optimization and estimation processes, with the goal of having continuous estimates during the optimization and hence achieving lower memory requirements.
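
As a small illustration of why fusion and shifting affect the memory requirement, the sketch below counts how many elements of an intermediate array B are alive at once under different schedules. It is a toy model, not the method of the paper. The trace format, the helper `peak_live`, and the example loops (a producer writing B[i] and a consumer reading B[i] and B[i+1]) are all assumptions made for this example.

```python
def peak_live(trace):
    """trace: list of ('w' or 'r', element) in execution order.

    Returns the peak number of elements that have been written and still
    have a pending read. Raises ValueError if a read precedes its write.
    """
    written_at, last_read = {}, {}
    for t, (op, elem) in enumerate(trace):
        if op == "w":
            written_at.setdefault(elem, t)
        else:
            if elem not in written_at:
                raise ValueError(f"{elem} read before it is written")
            last_read[elem] = t
    peak = 0
    for t in range(len(trace)):
        live = sum(1 for e, w in written_at.items()
                   if w <= t <= last_read.get(e, w))
        peak = max(peak, live)
    return peak

N = 8

# Two separate loops: produce all of B, then consume B[i] and B[i+1].
unfused = [("w", i) for i in range(N)]
for i in range(N - 1):
    unfused += [("r", i), ("r", i + 1)]

# Fused, with the consumer shifted by one iteration so B[i+1] is ready.
fused_shifted = []
for i in range(N):
    fused_shifted.append(("w", i))
    if i >= 1:
        fused_shifted += [("r", i - 1), ("r", i)]

# Fused without shifting: the consumer would read B[i+1] too early.
fused_unshifted = []
for i in range(N - 1):
    fused_unshifted += [("w", i), ("r", i), ("r", i + 1)]

unfused_peak = peak_live(unfused)
fused_peak = peak_live(fused_shifted)
```

Here the separate loops keep all N elements of B alive, while the fused and shifted schedule needs only two. Fusion without the shift is rejected because it violates the dependence.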

    Hierarchical Memory Size Estimation for Loop Fusion and Loop Shifting in Data-Dominated Applications

    Loop fusion and loop shifting are important transformations for improving data locality to reduce the number of costly accesses to off-chip memories. Since exploring the exact platform mapping for all the loop transformation alternatives is a time-consuming process, heuristics steered by improved data locality are generally used. However, pure locality estimates do not sufficiently take into account the hierarchy of the memory platform. This paper presents a fast, incremental technique for hierarchical memory size requirement estimation for loop fusion and loop shifting at the early loop transformation design stage. As the exact memory platform is often not yet defined at this stage, we propose a platform-independent approach which reports the Pareto-optimal trade-off points for scratch-pad memory size and off-chip memory accesses. Experiments on realistic test vehicles confirm that the estimation comes very close to the actual platform mapping. It helps the designer or a tool to find the interesting loop transformations that should then be investigated in more depth afterward.
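
The Pareto-optimal trade-off points mentioned above can be pictured with a short sketch. It filters candidate (scratch-pad size, off-chip accesses) pairs down to the non-dominated ones, with both quantities minimized. The function name `pareto_front` and all numbers are made up for illustration, and the paper's estimation of the candidate points themselves is not reproduced here.

```python
def pareto_front(points):
    """Keep the points not dominated in (size, accesses), both minimized."""
    front = []
    best_accesses = float("inf")
    # Sorted by size, then accesses: a point survives only if it needs
    # strictly fewer off-chip accesses than every smaller-or-equal size.
    for size, accesses in sorted(set(points)):
        if accesses < best_accesses:
            front.append((size, accesses))
            best_accesses = accesses
    return front

# Hypothetical candidates for one loop transformation instance:
# (scratch-pad size in bytes, number of off-chip accesses).
candidates = [
    (0, 1000), (64, 400), (128, 400),
    (256, 150), (256, 300), (512, 150), (1024, 20),
]
front = pareto_front(candidates)
```

For these made-up candidates the front is (0, 1000), (64, 400), (256, 150) and (1024, 20). Comparing such fronts across loop transformation instances is one way a designer or tool could shortlist transformations before the memory platform is fixed.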

    Practice and innovation in the operation and maintenance of HI-13 tandem accelerator for 35 years

    The HI-13 tandem accelerator, located at the Beijing Tandem Accelerator National Laboratory, has been in operation for 35 years. To ensure the continued performance of the accelerator, the operation and maintenance team has focused on several priorities. The operation team conducted research that involved developing key components, cultivating a high-quality operational team, improving machine time efficiency, and increasing the participation of users from outside the China Institute of Atomic Energy (CIAE). The primary emphasis has been on developing key components and upgrading subsystems. These efforts have successfully maintained and improved the accelerator's performance, ensuring its safe and stable operation. Finally, the paper also discusses the challenges faced by tandem accelerators and presents future development plans.